Advanced Bookkeeping

Non-Markovian Accumulators, Transition States, and Backwards Conversion

Learning Objectives

  • Establish the transition rate matrix as the central “hub” of the modeling process.
  • Include transition states and accumulators to accurately track and count events, costs, and QALYs.
  • Backwards convert an existing Markov model to facilitate new health states, new strategies, or new settings.

Review of Key Concepts

Transition Rate Matrix

  • The central “hub” of a Markov model.
  • Straightforward to convert rate matrix into a transition probability matrix.
  • Facilitates modeling using alternative techniques:
    • Continuous time Markov
    • Discrete event simulation

Transition Rate Matrix

  • The “workshop” where you can include additional health states, new evidence, change cycle lengths, etc.
  • Ensures that results obtained via discrete time Markov model will match those under an identical model but different method (e.g., DES).

What they didn’t teach you

  • The transition rate matrix is also essential for accurate and transparent accounting of costs and QALYs.

What we’ll teach you

  • We’ll demonstrate how to augment the matrix with non-Markovian elements using the basic CVD model from the last session.
  • We’ll show you how to backwards convert an existing transition probability matrix into a continuous generator (rate) matrix.
  • We’ll then replicate and extend a recent didactic model (Green et al. 2023) in an applied case study using these methods.

CVD Model

CVD Model

  • Let’s quickly build up the basic CVD model again.

CVD Model

G Healthy Healthy Healthy->Healthy CVD CVD Healthy->CVD r_H_CVD = 0.15 Dead Dead Healthy->Dead r_HD=0.01 CVD->CVD CVD->Dead hr_CVD * r_HD hr_CVD=10 Dead->Dead

Parameterize

params = 
  list(
    t_names = c("natural_history"),           # Strategy names. 
    n_treatments = 1,                         # Number of treatments
    s_names  = c("Healthy", "CVD", "Dead"),   # State names
    n_states = 3,                             # Number of states
    n_cohort = 1,                             # Cohort size
    n_cycles = 100,                           # Number of cycles in model.  
    cycle = 1,                                # Cycle length
    initial_age = 55,                         # Cohort starting age
    r_H_CVD = 0.15,                           # Rate of healthy -> CVD
    hr_CVD = 10,                              # Hazard Ratio: CVD Death
    r_H_D = 0.01                              # Rate of healthy -> dead
  )

Transition Rate Matrix

params$mR <- 
  with(params,{
    r_CVD_D <- hr_CVD * r_H_D
    R_ <- 
      array(data = c(0, 0, 0,  
                 r_H_CVD,0, 0,
                 r_H_D,r_H_D + r_CVD_D,0,
                 r_H_D, 0,0),
          dim = c(n_states, n_states, n_treatments),
                  dimnames = list(from = s_names,
                                  to = s_names,
                                  t_names))
    R <- apply(R_,3, simplify=FALSE,function(x) {
      diag(x) = -rowSums(x)
      x * cycle
    })
  R    
  })

Transition Rate Matrix

params$mR
$natural_history
         to
from      Healthy   CVD Dead
  Healthy   -0.16  0.15 0.01
  CVD        0.00 -0.11 0.11
  Dead       0.00  0.00 0.00

Transition Probability Matrix

expm(params$mR[["natural_history"]])
        Healthy     CVD     Dead
Healthy 0.85214 0.13107 0.016785
CVD     0.00000 0.89583 0.104166
Dead    0.00000 0.00000 1.000000

Advanced Bookkeeping

  1. Given compound transitions, how can we track the total number of people who develop CVD?
  2. What if there is a cost associated with a single transition type (e.g., CVD CVD death)?
  3. What if we want to include a tunnel state to capture transient utility changes after contracting CVD or experiencing an acute event?

Advanced Bookkeeping

Augmenting the transition matrix with non-markovian elements can address each question:

  1. Accumulator states to count total number of people who enter a health state.
  2. Transition states to count the number of people who transition into a state in a given cycle.
  3. Tunnel states to capture transient events (e.g., 2 cycle utility decrement, etc.)

Markovian Transition Matrices

  • Traditional Markov modeling approaches restrict to a transition probability matrix.
  • Row sums all equal 1.0.

Markovian Transition Matrices

  • Traditional Markov modeling approaches restrict to a transition probability matrix.
  • Row sums all equal 1.0.
Healthy CVD Dead
Healthy 0.852 0.131 0.017
CVD 0.000 0.896 0.104
Dead 0.000 0.000 1.000

Non-Markovian Elements

  • Augmenting the transition matrix with non-markovian rows and columns can facilitate accurate counting and bookkeeping.
  • Depending on the objective, can include non-markovian accumulators and/or transition states.

Non-Markovian Elements

  • Accumulator: tracks the total number of individuals who have entered a given state up until a given cycle (even if they moved out of the state later).

Non-Markovian Elements

  • Transition state: tracks the total number of individuals who enter a given state in a given cycle.
    • Can construct as a tunnel state if you move them through to the next state (or exit for some other reason like background mortality).
    • But you don’t have to do this. Depending on objective, you could just count the total number who enter the state to count up transitory costs, etc.

Where we’re headed

Healthy CVD Dead trCVD accCVD
Healthy 0.852 0.131 0.017 . .
CVD 0 0.896 0.104 . .
Dead 0 0 1 . .
trCVD . . . . .
accCVD . . . . .

Augmenting the Transition Matrix

Augmenting the Transition Matrix

  • As noted earlier, all roads begin from the transition rate matrix.
  • The way in which we construct the Markovian components is the same.
    • Place rates as appropriate.
    • Diagonal of Markovian elements is the negative sum of the other row elements.

Augmenting the Transition Matrix

  • Rate matrix for basic CVD model
Healthy CVD Dead
Healthy -0.16 0.15 0.01
CVD 0.00 -0.11 0.11
Dead 0.00 0.00 0.00

Total CVD Cases

  • Suppose we wanted to track the total number of people who develop CVD over the simulated time horizon.
  • We can’t do this accurately using CVD state occupancy in the Markov trace due to compound transitions.

Total CVD Cases

  • We need some way to track everyone who moves into the CVD state.
  • We can do this by adding a non-markovian accumulator to the rate matrix.

1. Accumulators

Non-Markovian Accumulators

We’ll start with the original transition rate matrix:

R_ <- params$mR[["natural_history"]]
R_
         to
from      Healthy   CVD Dead
  Healthy   -0.16  0.15 0.01
  CVD        0.00 -0.11 0.11
  Dead       0.00  0.00 0.00

Non-Markovian Accumulators

  • Next, we will add both a column and a row for the accumulator that tracks movement into the CVD health state.
  • For now, just include zeros.

Non-Markovian Accumulators

  • Next, we will add both a column and a row for the accumulator that tracks movement into the CVD health state.

  • For now, just include zeros.

R <- 
  cbind(R_,"accCVD" = c(0,0,0)) %>% 
  rbind(.,"accCVD" = c(0,0,0,0))  
R
        Healthy   CVD Dead accCVD
Healthy   -0.16  0.15 0.01      0
CVD        0.00 -0.11 0.11      0
Dead       0.00  0.00 0.00      0
accCVD     0.00  0.00 0.00      0

Non-Markovian Accumulators

  • Next fill in the appropriate transition rate.
R["Healthy","accCVD"] = params$r_H_CVD
R
        Healthy   CVD Dead accCVD
Healthy   -0.16  0.15 0.01   0.15
CVD        0.00 -0.11 0.11   0.00
Dead       0.00  0.00 0.00   0.00
accCVD     0.00  0.00 0.00   0.00

Non-Markovian Accumulators

  • Embed using matrix exponentiation to obtain the transition probability matrix.
  • The transition matrix now includes a Markovian submatrix (green) and a non-markovian accumulator component (purple).
expm(R)  
Healthy CVD Dead accCVD
Healthy 0.852 0.131 0.017 0.139
CVD 0 0.896 0.104 0
Dead 0 0 1 0
accCVD 0 0 0 1

Non-Markovian Accumulators

  • Note that the Markovian submatrix still has rows that sum to one.
  • Also note that the accumulator captures the same transition probability that was (incorrectly) placed in the Healthy CVD cell in lecture 2!
    • Recall that this probability is the marginal probability of transition (i.e., all transitions into CVD state). So it’s exactly what we want to accumulate!

How Many with CVD After 1 Years?

  • If we simply went based on state occupancy after one cycle, we’d understate the number of people who develop CVD.
    • This is the number who remain in the CVD health state.
    • Some had a compound transition and died within the cycle after they developed CVD!
cycle Healthy CVD Dead accCVD
0 100000 0 0.0 0
1 85214 13107 1678.5 13862

How Many with CVD After 1 Years?

  • The accumulator column tells us how many total people developed CVD.
cycle Healthy CVD Dead accCVD
0 100000 0 0.0 0
1 85214 13107 1678.5 13862

2. Transition States

Transition States

  • What if we need to count how many people transition to a given state in a cycle?
  • For example, we may want to apply a one-time cost to a transition such as CVD CVD death.
  • This type of transition is not immediately captured in the Markov trace.
    • Observed cycle-to-cycle change in death counts capture CVD CVD Death as well as death from background causes.

Transition States

  • Transition states are straightforward to capture as another element in the transition matrix.
  • Similar to an accumulator state, but we require state exit after one cycle.
  • Amounts so simply zeroing out the accCVD accCVD transition probability.

Transition States

We’ll start with the original transition rate matrix:

R_ <- params$mR[["natural_history"]]
R_
         to
from      Healthy   CVD Dead
  Healthy   -0.16  0.15 0.01
  CVD        0.00 -0.11 0.11
  Dead       0.00  0.00 0.00

Transition States

  • Next, we will add both a column and a row for the transition state that tracks movement into the CVD Death state.

Transition States

  • Next, we will add both a column and a row for the transition state that tracks movement into the CVD Death state.
  • For now, just include zeros.
R <- 
  cbind(R_,"trCVDDeath" = c(0,0,0)) %>% 
  rbind(.,"trCVDDeath" = c(0,0,0,0))  
R
           Healthy   CVD Dead trCVDDeath
Healthy      -0.16  0.15 0.01          0
CVD           0.00 -0.11 0.11          0
Dead          0.00  0.00 0.00          0
trCVDDeath    0.00  0.00 0.00          0

Transition States

  • We then fill in the appropriate transition rate.
R["CVD","trCVDDeath"] = params$hr_CVD * params$r_H_D
R
           Healthy   CVD Dead trCVDDeath
Healthy      -0.16  0.15 0.01        0.0
CVD           0.00 -0.11 0.11        0.1
Dead          0.00  0.00 0.00        0.0
trCVDDeath    0.00  0.00 0.00        0.0

Transition States

  • Embed using matrix exponentiation to obtain the transition probability matrix.
  • Note that we need to manually zero out the transition probability that keeps people in the transition state.
P = expm(R)  
P["trCVDDeath","trCVDDeath"] = 0
P
           Healthy     CVD     Dead trCVDDeath
Healthy    0.85214 0.13107 0.016785  0.0068583
CVD        0.00000 0.89583 0.104166  0.0946962
Dead       0.00000 0.00000 1.000000  0.0000000
trCVDDeath 0.00000 0.00000 0.000000  0.0000000

Transition States

Healthy CVD Dead trCVDDeath
Healthy 0.852 0.131 0.017 0.007
CVD 0.000 0.896 0.104 0.095
Dead 0.000 0.000 1.000 0.000
trCVDDeath 0.000 0.000 0.000 0.000
  • Note that the Markovian submatrix still has rows that sum to one.
  • Also note that the transition state column captures transitions to death that occur from both the Healthy state (compound transitions) and the CVD state!

How Many CVD Death Transitions Are There?

  • Hypothetical cohort of 100,000.
  • Note how there are some deaths in the first cycle, despite everyone starting from healthy!
cycle Healthy CVD Dead trCVDDeath
0 100000 0 0.0 0.00
1 85214 13107 1678.5 685.83

Incorporating a one-time cost

  • Can define a cost “payoff” for trCVDDeath and apply as usual to calculate accumulated costs from a one-time CVD death cost.
# Define the cost vector
cost_payoff = c("Healthy" = 0, "CVD" = 500, "Dead" = 0, "trCVDDeath" = 2000) 

# Sum up total costs in each cycle
costs_cycle = trace[["natural_history"]][-1,] %*% cost_payoff

# Function for cycle correction (alternative Simpson's method)
alt_simp_coef <- function(i) c(17, 59, 43, 49, rep(48, i-8), 49, 43, 59, 17) / 48
cycle_adj      <- function(x) sum(alt_simp_coef(length(x)) * x)

total_costs = cycle_adj(costs_cycle) 
total_costs
[1] 5926.6

3. Tunnel States

Tunnel States

  • Tunnel states are structured similarly to accumulators and transition states.
  • One additional complication: need to allow for tunnel state exit to background mortality (or other exit states).

Accumulators and Transition States

  • Occupy both ends of a spectrum.
  • Accumulators: probability of remaining in accumulator state = 1.0.
  • Transition States: probability of remaining in transition state = 0.0.

Tunnel States

  • Probability of remaining in tunnel: 0 < p < 1.
  • Reflects the idea that most will remain in the tunnel for the cycle, but some might leave due to secular death, transition back to a less severe disease state (e.g., remission), etc.

Tunnel States

  • Process for constructing is similar, with a “catch.”
  • We are already accounting for transitions to death in the primary Markovian submatrix.
  • But for accurate bookkeeping purposes (e.g., QALYs and costs) we still need to net them out.

Tunnel States

  • We’ll do this by including a temporary “receiving bucket” in the rate matrix.
  • The receiving bucket is used for embedding, but can be discarded for the final transition probability matrix.

Tunnel States

  • Suppose we want to record a temporary 2 year (cycle) initial treatment cost once someone develops CVD.
  • We can set up a tunnel state to capture this.

Tunnel State

We’ll start with the original transition rate matrix:

$natural_history
         to
from      Healthy   CVD Dead
  Healthy   -0.16  0.15 0.01
  CVD        0.00 -0.11 0.11
  Dead       0.00  0.00 0.00
R_ <- params$mR[["natural_history"]]
R_
         to
from      Healthy   CVD Dead
  Healthy   -0.16  0.15 0.01
  CVD        0.00 -0.11 0.11
  Dead       0.00  0.00 0.00

Tunnel State

  • Next, we will add columns and rows for the tunnel states.
  • Given that our model has an annual cycle, and the tunnel lasts two years, we need to add two columns and two rows PLUS a receiving bucket. So three new rows and columns.
  • If we had a monthly time cycle, we’d need to add 25 …

Tunnel States

  • For now, just include zeros.
R <- 
  cbind(R_,"tunCVDy1" = c(0,0,0), "tunCVDy2" = c(0,0,0),"NULL" = c(0,0,0)) %>% 
  rbind(.,"tunCVDy1" = c(0,0,0,0,0,0), "tunCVDy2" = c(0,0,0,0,0,0), "NULL" = c(0,0,0,0,0,0))  
R
         Healthy   CVD Dead tunCVDy1 tunCVDy2 NULL
Healthy    -0.16  0.15 0.01        0        0    0
CVD         0.00 -0.11 0.11        0        0    0
Dead        0.00  0.00 0.00        0        0    0
tunCVDy1    0.00  0.00 0.00        0        0    0
tunCVDy2    0.00  0.00 0.00        0        0    0
NULL        0.00  0.00 0.00        0        0    0

Tunnel State

  • We then fill in the appropriate transition rate for entry into the tunnel.
R["Healthy","tunCVDy1"] = R["Healthy","CVD"]
R
         Healthy   CVD Dead tunCVDy1 tunCVDy2 NULL
Healthy    -0.16  0.15 0.01     0.15        0    0
CVD         0.00 -0.11 0.11     0.00        0    0
Dead        0.00  0.00 0.00     0.00        0    0
tunCVDy1    0.00  0.00 0.00     0.00        0    0
tunCVDy2    0.00  0.00 0.00     0.00        0    0
NULL        0.00  0.00 0.00     0.00        0    0

Tunnel State

  • Embed using matrix exponentiation to obtain the initial transition probability matrix.
  • Note that we need to TK
P_ = expm(R)  
P_["tunCVDy1","NULL"] = P_["CVD","Dead"]
P_["tunCVDy1","tunCVDy1"] = 0
P_["tunCVDy1","tunCVDy2"] = 1 - P_["tunCVDy1","NULL"]
P_["tunCVDy2","tunCVDy2"] = 0
P_["tunCVDy2","NULL"] = 1
P = P_[-which(rownames(P_)=="NULL"),-which(colnames(P_)=="NULL")]
P
         Healthy     CVD     Dead tunCVDy1 tunCVDy2
Healthy  0.85214 0.13107 0.016785  0.13862  0.00000
CVD      0.00000 0.89583 0.104166  0.00000  0.00000
Dead     0.00000 0.00000 1.000000  0.00000  0.00000
tunCVDy1 0.00000 0.00000 0.000000  0.00000  0.89583
tunCVDy2 0.00000 0.00000 0.000000  0.00000  0.00000

Transition States

Healthy CVD Dead tunCVDy1 tunCVDy2
Healthy 0.852 0.131 0.017 0.139 0.000
CVD 0.000 0.896 0.104 0.000 0.000
Dead 0.000 0.000 1.000 0.000 0.000
tunCVDy1 0.000 0.000 0.000 0.000 0.896
tunCVDy2 0.000 0.000 0.000 0.000 0.000
  • Note that the Markovian submatrix still has rows that sum to one.

Probabilities to Rates

  • Inverse problem exists of having a transition probability model specified and needing the rates.
  • The rate matrix does not always exist, and this is the usual case.
  • Cause: Statistical error in observation, jump over states.
  • Two methods exist, eigenvalue method and Higham method.

HIV Model

      A     B     C     D
A 0.721 0.202 0.067 0.010
B 0.000 0.581 0.407 0.012
C 0.000 0.000 0.750 0.250
D 0.000 0.000 0.000 1.000
  • Original paper specified probabilities observed.
  • A (Healthly) -> B (cd4 < 200) -> C (AIDS) -> D (Death)
  • Note there are “jumpover” transitions.
  • From ‘Decision Modelling for Health Economic Evaluation’ by Briggs, Claxton, Sculpher.

Eigenvalue Method

V  <- eigen(m_P)$vectors
iV <- solve(V)
Ap <- iV %*% m_P %*% V
lAp <- diag(log(diag(Ap)), nrow(Ap), ncol(Ap))
R  <- V %*% lAp %*% iV
R[abs(R) < 1e-6 ] <- 0
rownames(R) <- c("A", "B", "C", "D")
colnames(R) <- c("A", "B", "C", "D")
round(R,3)
       A      B      C      D
A -0.327  0.311  0.002  0.013
B  0.000 -0.543  0.615 -0.072
C  0.000  0.000 -0.288  0.288
D  0.000  0.000  0.000  0.000
  • Note there is a negative rate from B -> D.
  • This is a transition from D -> B and should be moved to the other side of the matrix.

Eigenvalue Method (cont.)

       A      B      C      D
A -0.327  0.311  0.002  0.013
B  0.000 -0.543  0.615 -0.072
C  0.000  0.000 -0.288  0.288
D  0.000  0.000  0.000  0.000
  • The negative rate implies an identifiable issue.
  • Note rates from A->C, A->D are relatively small, implying these are from statistical noise in observation.
  • expm::logm Higham method returns same as Eigenvalue method.
  • Tweaking rates to get a proper model is ad-hoc and difficult.

Reestimating Rates

  • Imposition of structural assumptions and fitting to raw data would likely result in better rate estimates.
  • We could assume that a patient with HIV at this point in time only gets sicker and external causes of death are negligible. A->B->C->D constrains model.
  • We use the reported data from the paper at year 1
  • MSM Method: https://chjackson.github.io/msm/msmcourse/

Reestimating Rates (MSM Method)

library(msm)
x <- rbind(
  # Start in state 1 at time 0.
  data.frame(
    ptnum=1:1000, years=rep(0, 1000), state=rep(1, 1000)
  ),
  # Observed at year 1 new state.
  data.frame(
    ptnum=1:1000, years=rep(1, 1000),
    state=c(rep(1, 721), rep(2, 202),
            rep(3, 67),  rep(4, 10))
  )
)
x <- x[order(x$ptnum, x$years),]

Reestimating Rates (Structural Assumption)

statetable.msm(state, ptnum, data=x)
    to
from   1   2   3   4
   1 721 202  67  10
Q.init <- rbind(rbind(c(0, 0.3, 0,   0),    # A -> B only
                      c(0, 0,   0.6, 0),    # B -> C only
                      c(0, 0,   0,   0.3),  # C -> D only
                      c(0, 0,   0,   0)))
  • Q represents structural assumptions and estimate
  • Estimates were taken from Eigenvalue method
  • Relaxing structural assumptions does not converge

Reestimating Rates (Likelihood Fit)

hiv.msm <- msm(state ~ years, subject = ptnum, data = x, qmatrix = Q.init)
hiv.msm

Call:
msm(formula = state ~ years, subject = ptnum, data = x, qmatrix = Q.init)

Maximum likelihood estimates

Transition intensities
                  Baseline                 
State 1 - State 1 -0.3271 (-0.3680,-0.2907)
State 1 - State 2  0.3271 ( 0.2907, 0.3680)
State 2 - State 2 -0.6454 (-0.8181,-0.5092)
State 2 - State 3  0.6454 ( 0.5092, 0.8181)
State 3 - State 3 -0.3982 (-0.7562,-0.2097)
State 3 - State 4  0.3982 ( 0.2097, 0.7562)

-2 * log-likelihood:  1572.2 
  • All positive.
  • Provides 95% confidence intervals for a PSA.

Reestimating Rates (Transition Probabilities)

m_P
      A     B     C     D
A 0.721 0.202 0.067 0.010
B 0.000 0.581 0.407 0.012
C 0.000 0.000 0.750 0.250
D 0.000 0.000 0.000 1.000
round(pmatrix.msm(hiv.msm, t=1), 3)
        State 1 State 2 State 3 State 4
State 1   0.721   0.202   0.067   0.010
State 2   0.000   0.524   0.384   0.092
State 3   0.000   0.000   0.672   0.328
State 4   0.000   0.000   0.000   1.000
  • Quite close.
  • Problematic transition from B -> D probability is higher.

Reestimating Rates (Transition Probabilities)

c(1000, 0, 0, 0) %*% m_P
       A   B  C  D
[1,] 721 202 67 10
c(1000, 0, 0, 0) %*% pmatrix.msm(hiv.msm, t=1)
     State 1 State 2 State 3 State 4
[1,]     721     202      67      10
  • Resulting expectation after 1 cycle is nearly identical.
  • The msm method has bits down at 4th decimal. Round off made them identical.
  • Only a tiny bit of error (721 vs. 721.0002) makes log numerically unstable converting from probability to rate.

Conclusions

  • Identifying rate matrix from transition probability matrix in presence of error/jump over is in general not possible.
  • The matrix log methods (eigen/Higham) usually results in impossible models.
  • Using observed data one can fit rate data using likelihood, e.g. msm.
  • Structural assumptions can be imposed and may be necessary.
  • 95% confidence intervals result.